(57) A device and method for performing SISO (Soft In-Soft Out) decoding, particularly for turbo decoders,
moving the forward-backward decoding approximation of MAP (Maximum a Posteriori probability) decoding.
The method comprises the steps of: (a) providing a trellis representative of an output of a convolutional
encoder, the convolutional encoder having a coding rate of R and the trellis having a block length T;
(b) assigning initial conditions to each starting node of the trellis for a forward iteration through the
trellis; (c) computing a forward metric for each node, starting from the start of the trellis and advancing
forward through the trellis, and storing forward metrics of nodes of a plurality of starting stages of
windows; (d) repeating stages d(1)-d(3) until all lambdas of the trellis are calculated: d(1) retrieving
forward metrics of nodes of a starting stage of a window, the retrieved forward metrics having been computed
and stored during step (c); d(2) computing and storing forward metrics for each node, starting from a second
stage of the window and ending at the ending stage of the window; d(3) computing backward metrics for each
node, starting from the ending stage of the window and ending at the starting stage of the window; wherein
when backward metrics of nodes of a stage are computed and the forward metrics of the nodes of an adjacent
stage were previously computed, the computation of backward metrics is integrated with the computation of
lambda from the stage to the adjacent stage and with storage of the calculated lambdas, in order to
accelerate the decoding operation.

Description

Field of the invention

[0001] Apparatus and method for performing Soft Input-Soft Output decoding, and especially an apparatus and
method for performing Log MAP and Max Log MAP algorithms.

BACKGROUND OF THE INVENTION 

[0002] Turbo Coding (TC) is used for error control coding in digital communications and signal processing. The
following references give some examples of various implementations of the TC: "Near Shannon limit error correcting
coding and decoding: turbo-codes", by Berrou, Glavieux, Thitimajshima, IEEE International Conference on
Communications, Geneva, Switzerland, pp. 1064-1070, May 1993; "Implementation and Performance of a Turbo/MAP Decoder",
Pietrobon, International Journal of Satellite Communications; "Turbo Coding", Heegard and Wicker, Kluwer Academic
Publishers 1999.

[0003] The MAP algorithm and the soft output Viterbi algorithm (SOVA) are Soft Input Soft Output (SISO) decoding
algorithms that have gained wide acceptance in the area of communications. Both algorithms are mentioned in U.S.
patent 5,933,462 of Viterbi et al.

[0004] The TC has gained wide acceptance in the area of communications, such as in cellular networks, modems,
and satellite communications. Some turbo encoders consist of two parallel-concatenated systematic convolutional
encoders separated by a random interleaver. A turbo decoder has two soft-in soft-out (SISO) decoders. The output of
the first SISO is coupled to the input of the second SISO via a first interleaver, while the output of the second SISO is
coupled to an input of the first SISO via a feedback loop that includes a deinterleaver.

[0005] A common SISO decoder uses either a maximum a posteriori (MAP) decoding algorithm or a Log MAP
decoding algorithm. The latter algorithm is analogous to the former algorithm but is performed in the logarithmic domain.
Another common decoding algorithm is the Max Log MAP algorithm. The Log MAP is analogous to the Max Log MAP, but the
implementation of the former involves the addition of a correction factor. Briefly, the MAP finds the most likely information
bit to have been transmitted in a coded sequence.

[0006] The output signals of a convolutional encoder are transmitted via a channel and are received by a receiver
that has a turbo decoder. The channel usually adds noise to the transmitted signal.

[0007] During the decoding process a trellis of the possible states of the coding is defined. The trellis includes a
plurality of nodes (states), organized in T stages, each stage having N = 2^(K-1) nodes, where T is the number
of received samples taken into account for evaluating which bit was transmitted from a transmitter having the
convolutional encoder and K is the constraint length of the code used for encoding. Each stage comprises states that
represent a given time. Each state is characterized by a forward state metric, commonly referred to as alpha (α or a),
and by a backward state metric, commonly referred to as beta (β or b). Each transition from a state to another state is
characterized by a branch metric, commonly referred to as gamma (γ).

[0008] Alphas, betas and gammas are used to evaluate a probability factor that indicates which signal was transmitted.
This probability factor is commonly known as lambda (Λ). A transition from a stage to an adjacent stage is
represented by a single lambda.

[0009] The articles mentioned above describe prior art methods for performing the MAP algorithm. These prior art
methods comprise three steps. During the first step the alphas that are associated with all the trellis states are calculated,
starting with the states of the first level of depth and moving forward. During the second step the betas associated with
all the trellis states are calculated, starting with the states of the L'th level of depth and moving backwards. Usually,
while betas are calculated the lambdas can also be calculated. Usually, the gammas are calculated during or even
before the first step.

[0010] The TC can be implemented in hardware or in software. When implemented in hardware, the TC will generally 
run much faster than the TC implemented in software. However, implementing the TC in hardware is more expensive 
in terms of semiconductor surface area, complexity, and cost. 
[0011] Calculating the lambdas of the whole trellis is very memory intensive. A very large number of alphas, betas
and gammas must be stored. 

[0012] Another prior art method is described in U.S. patent 5,933,462 of Viterbi. This patent describes a soft decision
output decoder for decoding convolutionally encoded code words. The decoder is based upon "generalized" Viterbi
decoders and a dual maxima processor. The decoder has various drawbacks, such as, but not limited to, the following:
The decoder has either a single backward decoder or two backward decoders. In both cases, and especially
in the case of a decoder with one backward decoder, the decoder is relatively time consuming. In both cases, a learning
period L equals a window W in which valid results are provided by the backward decoder and the forward decoder. Usually,
L<W and the decoder described in U.S. patent 5,933,462 is not effective. Furthermore, at the end of the learning period
an estimation of either a forward metric or a backward metric is provided. Calculations that are based upon these
estimations, such as the calculations of forward metrics, backward metrics and lambdas, are less accurate than
calculations that are based upon exact calculations of these variables.

[0013] The decoder described in U.S. patent 5,933,462 is limited to calculating state metrics of nodes over a window
having a length of 2L, where L is a number of constraint lengths and 2L is smaller than the block length T of the trellis.

[0014] There is a need to provide an improved device and method for performing high-accuracy SISO decoding that
is not memory intensive. There is also a need to provide a fast method for performing SISO decoding and to provide an
accelerating system for enhancing the performance of embedded systems.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] While the invention is pointed out with particularity in the appended claims, other features of the invention are
disclosed by the following detailed description taken in conjunction with the accompanying drawings, in which:

FIGS. 1-2 illustrate, in flow chart form, two methods for performing SISO decoding, in accordance with a preferred
embodiment of the present invention;
FIGS. 3-4 are schematic descriptions of systems for implementing the methods shown in FIGS. 1 and 2;
FIG. 5 is a schematic description of a system for decoding a sequence of signals output by a convolutional encoder
and transmitted over a channel according to a preferred embodiment of the invention; and
FIG. 6 is a detailed description of a system for decoding a sequence of signals output by a convolutional encoder
and transmitted over a channel according to a preferred embodiment of the invention.

DESCRIPTION OF THE PREFERRED EMBODIMENT 

[0016] It should be noted that the particular terms and expressions employed and the particular structural and
operational details disclosed in the detailed description and accompanying drawings are for illustrative purposes only
and are not intended to in any way limit the scope of the invention as described in the appended claims.

[0017] The invention provides an improved device and method for performing high-accuracy SISO decoding that is
not memory intensive. The trellis is divided into a plurality of windows. Accurate alphas (betas) and gammas are
calculated during a first step in which alphas of a whole trellis are calculated. During this step a plurality of alphas of nodes
of starting stages of windows (betas of nodes of ending stages of windows) are stored. During another step the alphas
(betas) and gammas are calculated and stored in a fast internal memory module. These calculated
values are used in another step of calculating accurate betas (alphas) and accurate lambdas of a window, and providing
the lambdas to an external memory. The internal memory stores a plurality of variables that are required to calculate
the alphas, betas, gammas and lambdas of a window that is much smaller than the whole trellis.
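
By way of a non-limiting illustration, the storage saving described above can be estimated with the following Python sketch. It only counts forward metrics; the function name and the block length T = 5120 are hypothetical values chosen for the example, while N = 8 nodes per stage and a window of 64 stages are taken as example parameters.

```python
import math

def alpha_storage(T, WN, N):
    """Count the forward metrics (alphas) held under each scheme.

    T  -- block length of the trellis, in stages (hypothetical input)
    WN -- window length, in stages
    N  -- nodes (states) per stage
    """
    windows = math.ceil(T / WN)               # the last window may be shorter
    return {
        "full_trellis": T * N,                # every alpha kept until lambdas are done
        "external_checkpoints": windows * N,  # alphas of window starting stages only
        "internal_window": WN * N,            # alphas of the window being processed
    }

# Hypothetical block length; N and WN are example parameters.
print(alpha_storage(T=5120, WN=64, N=8))
```

Under these assumptions the windowed scheme keeps a few hundred metrics in each memory instead of tens of thousands, at the cost of computing the alphas of each window a second time.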

[0018] The invention provides an accelerating system for enhancing the performance of embedded systems.
The system has an internal memory and processors that can access an external memory, exchange information
with a host processor or another embedded system, and calculate lambdas by itself.

[0019] FIG. 1 is a simplified flow chart diagram illustrating method 30 of the present invention. Preferably, method
30 comprises steps 32, 34, 36, and 40, step 40 further comprising steps 42, 44, and 46, all steps illustrated by blocks.
Solid lines 33, 35, 37, 41, 43 and 45, coupling the steps, indicate a preferred method flow. Method 30 requires that only
a portion of the variables associated with the lambda calculations are stored in an internal memory. The method is fast
and does not require a learning period.

[0020] Method 30 starts in step 32 of providing a trellis representative of an output of a convolutional encoder, the
convolutional encoder having a coding rate of R and the trellis having a block length T. The trellis is divided into a plurality of
windows. The provision of the trellis involves receiving and storing a plurality of signals, such as parity bits Yp1,k and Yp2,k,
representing T transmitted symbols.

[0021] Step 32 is followed by step 34 of assigning initial conditions to each node of the starting stage and the ending 
stage of the trellis. 

[0022] Step 34 is followed by step 36 of computing a forward metric for each node, starting from the start of the trellis
and advancing forward through the trellis, and storing forward metrics of nodes of a plurality of starting stages of windows.
Preferably, the forward metrics of nodes of the starting stages of windows are stored in an external memory module.

[0023] Step 36 is followed by step 40 of computing lambdas. Step 40 conveniently comprises steps 42, 44 and
46. Steps 42-46 are repeated until all the lambdas associated with the trellis are calculated.

[0024] Step 42 comprises retrieving forward metrics of nodes of a starting stage of a window, the retrieved forward metrics
having been computed and stored during step 36. Conveniently, the windows are selected so that the backward metrics of
the nodes of the ending stage can be calculated swiftly and exactly. Usually, the computation of backward
metrics of nodes of an ending stage of a window that is not the last window in the trellis is preceded by a computation
of the backward metrics of the following window, wherein the starting stage of the following window follows the ending
stage of the window. Preferably, during a first iteration of steps 42-46 the lambdas of the last window of the trellis are
calculated and further iterations are used to calculate lambdas of preceding windows.

[0025] Step 42 is followed by step 44 of computing and storing forward metrics for each node, starting from a second
stage of the window and ending at the ending stage of the window. Preferably, the forward metrics are stored in an
internal memory module.

[0026] Step 44 is followed by step 46 of computing backward metrics for each node, starting from the ending stage
of the window and ending at the starting stage of the window; wherein when backward metrics of nodes of a stage are
computed and the forward metrics of the nodes of an adjacent stage were previously computed, the computation of
backward metrics is integrated with the computation of lambda from the stage to the adjacent stage. After a lambda
is calculated it is stored, preferably in an external memory module. As indicated by path 41, step 46 is
followed by step 42 until all lambdas of the trellis are calculated and stored.

[0027] Conveniently, all windows have the same length WN, wherein WN is much smaller than T. The windows do
not overlap. Preferably, step 40 starts by calculating lambdas of the last window of the trellis and advances backward
through the trellis.

[0028] Preferably, method 30 is used to implement the Log MAP algorithms. Conveniently, gammas are calculated
during steps 34 and 44.
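
The flow of steps 36 and 42-46 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the function name and the callables `fwd`, `bwd` and `lam` are hypothetical placeholders for the per-stage metric updates, stage k carries the metrics for k = 0..T, and transition k joins stage k to stage k+1.

```python
def method_30(T, WN, alpha_start, beta_end, fwd, bwd, lam):
    """Windowed lambda computation following steps 36 and 42-46.

    fwd(alpha, k) -> alpha at stage k+1   (hypothetical callable)
    bwd(beta, k)  -> beta at stage k      (hypothetical callable)
    lam(alpha, beta, k) -> lambda of transition k, from alpha at stage k
                           and beta at stage k+1 (hypothetical callable)
    """
    # Step 36: one forward pass over the whole trellis, keeping only the
    # alphas of the starting stage of each window ("external memory").
    external = {}
    alpha = alpha_start
    for k in range(T):
        if k % WN == 0:
            external[k] = alpha
        alpha = fwd(alpha, k)

    lambdas = [None] * T
    beta = beta_end                      # exact beta at the end of the trellis
    # Step 40: process the windows from the last one back to the first.
    for start in reversed(range(0, T, WN)):
        end = min(start + WN, T)
        window_alphas = [external[start]]            # step 42
        for k in range(start, end - 1):              # step 44 ("internal memory")
            window_alphas.append(fwd(window_alphas[-1], k))
        for k in reversed(range(start, end)):        # step 46
            lambdas[k] = lam(window_alphas[k - start], beta, k)
            beta = bwd(beta, k)          # carried into the preceding window
    return lambdas

# Toy check with scalar "metrics": windowing must not change the result.
g = [3, -1, 4, 1, -5, 9, 2]
fwd = lambda a, k: a + g[k]
bwd = lambda b, k: b - g[k]
lam = lambda a, b, k: a + b + g[k]
assert method_30(len(g), 3, 0, 0, fwd, bwd, lam) == method_30(len(g), len(g), 0, 0, fwd, bwd, lam)
```

Because the beta left over at the starting stage of one window is exactly the beta of the ending stage of the preceding window, no learning period is needed in this sketch.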

[0029] FIG. 2 is a simplified flow chart diagram illustrating method 50 of the present invention. Preferably, method
50 comprises steps 52, 54, 56, and 60, step 60 further comprising steps 62, 64 and 66, all steps illustrated by blocks.
Solid lines 53, 55, 57, 61, 63, and 65, coupling the steps, indicate a preferred method flow. Method 50 requires that
only a portion of the trellis is stored in an internal memory module. It is fast and does not require a learning period.

[0030] Method 50 starts in step 52 of providing a trellis representative of an output of a convolutional encoder, the
convolutional encoder having a coding rate of R and the trellis having a block length T and being divided into windows.

[0031] Step 52 is followed by step 54 of assigning initial conditions to each node of the ending stage and starting
stage of the trellis.

[0032] Step 54 is followed by step 56 of computing a backward metric for each node, starting from the end of the
trellis and advancing backward through the trellis, and storing backward metrics of nodes of a plurality of ending stages
of windows. Preferably, the backward metrics of the nodes of the ending stages of windows are stored in an external
memory module.

[0033] Step 56 is followed by step 60 of computing lambdas. Step 60 conveniently comprises steps 62, 64, and
66. Steps 62-66 are repeated until all the lambdas of the trellis are calculated.

[0034] Step 62 comprises retrieving backward metrics of nodes of an ending stage of a window, the retrieved backward
metrics having been computed and stored during step 56. Conveniently, the windows are selected so that the forward metrics
of the nodes of the starting stage can be calculated swiftly and exactly. Usually, the computation of
forward metrics of nodes of a starting stage of a window that is not the first window in the trellis is preceded by a
computation of the forward metrics of the preceding window, wherein the ending stage of the preceding window is followed
by the starting stage of the window. Preferably, during a first iteration of steps 62-66 the lambdas of the first window
of the trellis are calculated and further iterations are used to calculate the lambdas of consecutive windows.

[0035] Step 62 is followed by step 64 of computing and storing backward metrics for each node, starting from the
stage that precedes the last stage of the window and ending at the starting stage of the window. Preferably, the backward
metrics are stored in an internal memory module.

[0036] Step 64 is followed by step 66 of computing forward metrics for each node, starting from the starting stage
of the window and ending at the ending stage of the window; wherein when forward metrics of nodes of a stage are
computed and the backward metrics of the nodes of an adjacent stage were previously computed, the computation of
forward metrics is integrated with the computation of lambda from the stage to the adjacent stage. After a lambda is
calculated it is stored, preferably in an external memory module. As indicated by path 61, step 66 is followed
by step 62 until all lambdas of the trellis are calculated and stored.

[0037] Conveniently, all windows have the same length WN, wherein WN is much smaller than T. The windows do
not overlap. Preferably, step 60 starts by calculating lambdas of the first window of the trellis and advances forward
through the trellis.

[0038] Preferably, method 50 is used to implement the Log MAP algorithms. Conveniently, gammas are calculated 
during steps 54 and 64. 
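
The mirrored flow of steps 56 and 62-66 can be sketched in the same spirit. The Python below is an illustrative assumption only: the function name and the callables `fwd`, `bwd` and `lam` are hypothetical placeholders, stage k carries the metrics for k = 0..T, and transition k joins stage k to stage k+1.

```python
def method_50(T, WN, alpha_start, beta_end, fwd, bwd, lam):
    """Windowed lambda computation following steps 56 and 62-66.

    fwd(alpha, k) -> alpha at stage k+1, bwd(beta, k) -> beta at stage k,
    lam(alpha, beta, k) -> lambda of transition k (all hypothetical callables).
    """
    # Step 56: one backward pass over the whole trellis, keeping only the
    # betas of the ending stage of each window ("external memory").
    ends = {min(s + WN, T) for s in range(0, T, WN)}
    external = {}
    beta = beta_end
    for k in reversed(range(T)):
        if k + 1 in ends:
            external[k + 1] = beta
        beta = bwd(beta, k)

    lambdas = [None] * T
    alpha = alpha_start                  # exact alpha at the start of the trellis
    # Step 60: process the windows from the first one to the last.
    for start in range(0, T, WN):
        end = min(start + WN, T)
        window_betas = [external[end]]                 # step 62: beta at stage `end`
        for k in reversed(range(start + 1, end)):      # step 64 ("internal memory")
            window_betas.append(bwd(window_betas[-1], k))
        # window_betas[i] now holds the beta of stage end - i.
        for k in range(start, end):                    # step 66
            lambdas[k] = lam(alpha, window_betas[end - 1 - k], k)
            alpha = fwd(alpha, k)        # carried into the following window
    return lambdas
```

Here the alpha left at the ending stage of one window serves directly as the alpha of the starting stage of the following window, which is why the first window is processed first.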

[0039] FIGS. 3-4 are schematic descriptions of systems 70 and 80 for implementing methods 30 and 50. System 70
comprises external memory module 71; processor 72 that is coupled to external memory module 71 via data bus 712 and
control and address bus 711; and internal memory module 75 coupled to processor 72 via bus 751. System 80 is analogous
to system 70 but instead of processor 72 has forward processor 73, gamma processor 76, backward processor 74,
soft decision processor 77, host processor 79 and optional control unit 78. Control unit 78 coordinates
the calculations of various variables and the access to external and internal memory
modules 71 and 75. A person skilled in the art will appreciate that the calculations of alphas, betas, gammas and
lambdas can be made by various processors and various configurations of processors.

[0040] Host processor 79 is coupled to external memory module 71 via address and control bus 711 and data bus
712 and is coupled to control unit 78 via control bus 791. Internal memory module 75 is coupled to control unit 78 via
bus 781 and to soft decision processor 77, gamma processor 76, backward processor 74 and forward processor 73
via internal buses 751. Forward processor 73 and backward processor 74 are also coupled to bus 712.

[0041] Internal memory module 75 is adapted to store variables that are required during step 40 or step 60, such as the
forward metrics of nodes of a window, the backward metrics of nodes of a window and the gammas of a window. External
memory module 71 is adapted to store information that is associated with the whole trellis. The information can comprise
a set of systematic input signals Ys, two sets of either parity input signals Yp1, Yp2 or parity input signals Yp3, Yp4,
and a set of a-priori lambdas L.

[0042] Forward processor 73 is adapted to fetch information, such as a plurality of input signals, from external memory
module 71 and to compute forward metrics (alphas). Backward processor 74 is adapted to fetch information, such as
a plurality of received signals, from external memory module 71 and to compute backward metrics (betas).

[0043] Control unit 78 allows forward processor 73 to access external memory module 71 during step 36 and internal
memory module 75 during step 44, and allows backward processor 74 to access external memory module 71 during step
56 and internal memory module 75 during step 64.

[0044] Soft decision processor 77 is adapted to access internal memory module 75, gamma processor 76, backward
processor 74 and forward processor 73 to receive forward metrics, backward metrics and gammas, and to calculate
lambdas during steps 46 and 66. These lambdas are further sent to external memory module 71 during steps 46 and 66.

[0045] FIG. 5 is a schematic description of system 90 for decoding a sequence of signals output by a convolutional
encoder and transmitted over a channel according to a preferred embodiment of the invention.

[0046] System 90 is coupled to a host processor 79 and external memory module 71 via buses 791, 711 and 712.

[0047] System 90 comprises gamma processor 76, gamma register file 82, internal memory module 75, abc
processor 722 and processor register file 721. Abc processor 722 is coupled to gamma register file 82, to processor
register file 721 and to internal memory module 75 via buses 821, 791 and 751. Gamma processor 76 is coupled to gamma
register file 82 via bus 822 and to internal memory module 75 via bus 761.

[0048] Gamma processor 76 and processor register file 721 are coupled to bus 712 for receiving initial conditions
and input signals, and for providing alphas during step 36 or betas during step 56.

[0049] Gamma register file 82 is used to store gammas. Processor register file 721 is used to store alphas and betas
that are calculated by abc processor 722 and to store intermediate variables and results that are required for calculating
alphas, betas and lambdas. An exemplary implementation of system 90 (referred to as system 100) is shown in greater
detail in FIG. 6.

[0050] FIG. 6 is a detailed description of system 100 for decoding a sequence of signals output by a convolutional
encoder and transmitted over a channel according to a preferred embodiment of the invention.

[0051] Systems 70, 80, 90 and 100 can be implemented as a dedicated hardware accelerator within an embedded
system, for enhancing the performance of embedded systems.

[0052] System 100 is adapted to calculate lambdas according to methods 30 and 50, when R equals 1/2, 1/3, 1/4, 1/5
or 1/6, each stage comprises 8 nodes, and the length of all windows, except a last window of the trellis, is 64 stages.
System 100 is adapted to perform 8 ACS butterfly calculations in a single clock cycle.

[0053] For convenience of explanation it is assumed that system 100 implements method 30. If system 100 imple- 
ments method 50 then alpha memory 190 is used to store betas of a window and bus 360 is used to couple beta 
registers 160 - 167 to bus 712. 

[0054] It is assumed that system 100 calculates Λk that is associated with a transition from the (k-1)'th stage of the trellis
to the k'th stage. The (k-1)'th stage comprises eight nodes N0,k-1; N1,k-1; N2,k-1; N3,k-1; N4,k-1; N5,k-1; N6,k-1;
N7,k-1 and the k'th stage has eight nodes N0,k; N1,k; N2,k; N3,k; N4,k; N5,k; N6,k and N7,k. The forward metrics of
nodes N0,k-1 to N7,k-1 are denoted α(0,k-1), α(1,k-1), α(2,k-1), α(3,k-1), α(4,k-1), α(5,k-1), α(6,k-1) and α(7,k-1). The
backward metrics of nodes N0,k to N7,k are denoted β(0,k), β(1,k), β(2,k), β(3,k), β(4,k), β(5,k), β(6,k) and β(7,k).

Branch metric γ0,k is associated with a transition from (to) N0,k-1 to (from) N4,k and from (to) N1,k-1 to (from) N0,k.
Branch metric -γ0,k is associated with a transition from (to) N0,k-1 to (from) N0,k and from (to) N1,k-1 to (from) N4,k.
Branch metric γ1,k is associated with a transition from (to) N2,k-1 to (from) N1,k and from (to) N3,k-1 to (from) N5,k.
Branch metric -γ1,k is associated with a transition from (to) N2,k-1 to (from) N5,k and from (to) N3,k-1 to (from) N1,k.
Branch metric γ2,k is associated with a transition from (to) N5,k-1 to (from) N2,k and from (to) N4,k-1 to (from) N6,k.
Branch metric -γ2,k is associated with a transition from (to) N4,k-1 to (from) N2,k and from (to) N5,k-1 to (from) N6,k.
Branch metric γ3,k is associated with a transition from (to) N6,k-1 to (from) N3,k and from (to) N7,k-1 to (from) N7,k.
Branch metric -γ3,k is associated with a transition from (to) N6,k-1 to (from) N7,k and from (to) N7,k-1 to (from) N3,k.

[0055] Branch metrics γ0,k - γ3,k are given by the following equations:

(1) γ0,k = -(Lk + Yp1,k + Yp2,k)
(2) γ1,k = -(Lk - Yp1,k + Yp2,k)
(3) γ2,k = -(Lk - Yp1,k - Yp2,k)
(4) γ3,k = -(Lk + Yp1,k - Yp2,k)

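
As a minimal sketch, equations (1)-(4) translate directly into the following Python function. The function and argument names are assumptions made for this illustration; the numeric format of the inputs is left unspecified.

```python
def branch_metrics(L_k, yp1_k, yp2_k):
    """Return (gamma0, gamma1, gamma2, gamma3) for stage k, per equations (1)-(4).

    L_k   -- a-priori lambda of stage k
    yp1_k -- first parity input signal of stage k
    yp2_k -- second parity input signal of stage k
    """
    g0 = -(L_k + yp1_k + yp2_k)   # equation (1)
    g1 = -(L_k - yp1_k + yp2_k)   # equation (2)
    g2 = -(L_k - yp1_k - yp2_k)   # equation (3)
    g3 = -(L_k + yp1_k - yp2_k)   # equation (4)
    return g0, g1, g2, g3

print(branch_metrics(1, 2, 3))    # prints (-6, -2, 4, 0)
```
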
[0056] The forward metrics are given by the following equations:

(5) α(0,k) = MAX[(α(1,k-1)+γ0,k), (α(0,k-1)-γ0,k)]
(6) α(1,k) = MAX[(α(2,k-1)+γ1,k), (α(3,k-1)-γ1,k)]
(7) α(2,k) = MAX[(α(5,k-1)+γ2,k), (α(4,k-1)-γ2,k)]
(8) α(3,k) = MAX[(α(6,k-1)+γ3,k), (α(7,k-1)-γ3,k)]
(9) α(4,k) = MAX[(α(0,k-1)+γ0,k), (α(1,k-1)-γ0,k)]
(10) α(5,k) = MAX[(α(3,k-1)+γ1,k), (α(2,k-1)-γ1,k)]
(11) α(6,k) = MAX[(α(4,k-1)+γ2,k), (α(5,k-1)-γ2,k)]
(12) α(7,k) = MAX[(α(7,k-1)+γ3,k), (α(6,k-1)-γ3,k)]

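
Equations (5)-(12) share one add-compare-select pattern, which the following Python sketch captures with a small table. The table name `FORWARD` and the function name are hypothetical; each table row lists, for one node of stage k, the predecessor reached with +γ, the predecessor reached with -γ, and the index of the gamma involved.

```python
# Row j describes alpha(j,k): (predecessor with +gamma, predecessor with -gamma, gamma index)
FORWARD = [
    (1, 0, 0),  # equation (5)
    (2, 3, 1),  # equation (6)
    (5, 4, 2),  # equation (7)
    (6, 7, 3),  # equation (8)
    (0, 1, 0),  # equation (9)
    (3, 2, 1),  # equation (10)
    (4, 5, 2),  # equation (11)
    (7, 6, 3),  # equation (12)
]

def forward_step(alpha_prev, gammas):
    """Compute the eight alphas of stage k from those of stage k-1 (Max Log MAP form)."""
    return [
        max(alpha_prev[plus] + gammas[g], alpha_prev[minus] - gammas[g])
        for plus, minus, g in FORWARD
    ]

print(forward_step([0, 1, 2, 3, 4, 5, 6, 7], (1, 2, 3, 4)))
# prints [2, 4, 8, 10, 1, 5, 7, 11]
```
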
[0057] The backward metrics are given by the following equations:

(13) β(0,k-1) = MAX[(β(4,k)+γ0,k), (β(0,k)-γ0,k)]
(14) β(1,k-1) = MAX[(β(0,k)+γ0,k), (β(4,k)-γ0,k)]
(15) β(2,k-1) = MAX[(β(1,k)+γ1,k), (β(5,k)-γ1,k)]
(16) β(3,k-1) = MAX[(β(5,k)+γ1,k), (β(1,k)-γ1,k)]
(17) β(4,k-1) = MAX[(β(6,k)+γ2,k), (β(2,k)-γ2,k)]
(18) β(5,k-1) = MAX[(β(2,k)+γ2,k), (β(6,k)-γ2,k)]
(19) β(6,k-1) = MAX[(β(3,k)+γ3,k), (β(7,k)-γ3,k)]
(20) β(7,k-1) = MAX[(β(7,k)+γ3,k), (β(3,k)-γ3,k)]

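
The backward recursion of equations (13)-(20) can be sketched in Python in the same table-driven way. The names `BACKWARD` and `backward_step` are assumptions made for this illustration; each row lists, for one node of stage k-1, the successor reached with +γ, the successor reached with -γ, and the index of the gamma involved.

```python
# Row i describes beta(i,k-1): (successor with +gamma, successor with -gamma, gamma index)
BACKWARD = [
    (4, 0, 0),  # equation (13)
    (0, 4, 0),  # equation (14)
    (1, 5, 1),  # equation (15)
    (5, 1, 1),  # equation (16)
    (6, 2, 2),  # equation (17)
    (2, 6, 2),  # equation (18)
    (3, 7, 3),  # equation (19)
    (7, 3, 3),  # equation (20)
]

def backward_step(beta_next, gammas):
    """Compute the eight betas of stage k-1 from those of stage k (Max Log MAP form)."""
    return [
        max(beta_next[plus] + gammas[g], beta_next[minus] - gammas[g])
        for plus, minus, g in BACKWARD
    ]

print(backward_step([0, 1, 2, 3, 4, 5, 6, 7], (1, 2, 3, 4)))
# prints [5, 3, 3, 7, 9, 5, 7, 11]
```
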
[0058] Lambda is given by the following equations:

(21) Λk = (Max(0) - Max(1))/2 - Lk

(22) Max(0) = MAX[(α(4,k-1)+β(2,k)-γ2,k), (α(0,k-1)+β(0,k)-γ0,k),
(α(5,k-1)+β(6,k)-γ2,k), (α(1,k-1)+β(4,k)-γ0,k), (α(7,k-1)+β(3,k)-γ3,k),
(α(3,k-1)+β(1,k)-γ1,k), (α(6,k-1)+β(7,k)-γ3,k), (α(2,k-1)+β(5,k)-γ1,k)].

(23) Max(1) = MAX[(α(4,k-1)+β(6,k)+γ2,k), (α(0,k-1)+β(4,k)+γ0,k),
(α(5,k-1)+β(2,k)+γ2,k), (α(1,k-1)+β(0,k)+γ0,k), (α(6,k-1)+β(3,k)+γ3,k),
(α(2,k-1)+β(1,k)+γ1,k), (α(7,k-1)+β(7,k)+γ3,k), (α(3,k-1)+β(5,k)+γ1,k)].
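
A minimal Python sketch of equations (21)-(23) follows. The names `PLUS`, `MINUS` and `soft_output` are hypothetical. As an assumption of this sketch, the transition lists are read off the forward and backward equations (5)-(20), so that every node of stage k-1 and every node of stage k appears exactly once in each list; in particular the transition from N7,k-1 to N3,k is taken to carry -γ3,k.

```python
# (node at stage k-1, node at stage k, gamma index)
PLUS = [(4, 6, 2), (0, 4, 0), (5, 2, 2), (1, 0, 0),    # transitions carrying +gamma
        (6, 3, 3), (2, 1, 1), (7, 7, 3), (3, 5, 1)]
MINUS = [(4, 2, 2), (0, 0, 0), (5, 6, 2), (1, 4, 0),   # transitions carrying -gamma
         (7, 3, 3), (3, 1, 1), (6, 7, 3), (2, 5, 1)]

def soft_output(alpha_prev, beta_next, gammas, L_k):
    """Return lambda for the transition from stage k-1 to stage k.

    alpha_prev -- eight alphas of stage k-1
    beta_next  -- eight betas of stage k
    gammas     -- (gamma0, gamma1, gamma2, gamma3) of stage k
    L_k        -- a-priori lambda of stage k
    """
    max0 = max(alpha_prev[p] + beta_next[n] - gammas[g] for p, n, g in MINUS)  # eq. (22)
    max1 = max(alpha_prev[p] + beta_next[n] + gammas[g] for p, n, g in PLUS)   # eq. (23)
    return (max0 - max1) / 2 - L_k                                            # eq. (21)

print(soft_output([0] * 8, [0] * 8, (1, 2, 3, 4), 0))   # prints -2.5
```
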

[0059] When a Log MAP algorithm is implemented, the calculation involves the addition of a correction factor that
is preferably stored in a look up table. The correction factor is not required when a Max Log MAP algorithm is implemented.
Such a look up table (not shown) is coupled to ALU0-ALU7 140-147 and to MAX_0 and MAX_1 units 210 and 211, or forms
a part of each of the mentioned units.
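
To illustrate the role of the correction factor, the following Python sketch compares the plain maximum with the standard Jacobian-logarithm form, log(e^a + e^b) = max(a,b) + log(1 + e^-|a-b|), and with a table-based approximation of it. The table size, the quantization step and all names are hypothetical; the text does not state the table layout or the scaling used by system 100.

```python
import math

STEP = 0.5   # hypothetical quantization step for |a - b|
# Hypothetical 8-entry look up table of log(1 + exp(-d)) sampled at d = 0, STEP, 2*STEP, ...
TABLE = [math.log1p(math.exp(-i * STEP)) for i in range(8)]

def max_log(a, b):
    """Max Log MAP: no correction factor."""
    return max(a, b)

def max_star_exact(a, b):
    """Log MAP: maximum plus the exact correction factor."""
    return max(a, b) + math.log1p(math.exp(-abs(a - b)))

def max_star_lut(a, b):
    """Log MAP with the correction factor read from the look up table."""
    idx = int(abs(a - b) / STEP)
    corr = TABLE[idx] if idx < len(TABLE) else 0.0   # correction vanishes for large gaps
    return max(a, b) + corr

for a, b in [(1.0, 1.0), (0.3, -1.2), (0.0, 100.0)]:
    print(a, b, max_log(a, b), max_star_exact(a, b), max_star_lut(a, b))
```

The correction is largest (log 2) when the two candidates are equal and fades as they diverge, which is why a short table is usually considered sufficient.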

[0060] System 100 is coupled to bus 712 for exchanging information with external memory module 71, is coupled
to bus 791 for receiving control signals from host processor 79 and is coupled to bus 711 for providing control signals
and addresses to external memory module 71. System 100 has an address generator and control unit (control
unit) 230 that controls other units of system 100 and controls the exchange of information with external memory module 71.
Control unit 230 is coupled to the other units of system 100 by bus 330; for convenience of explanation the
various connections are not shown.

[0061] System 100 comprises: registers 103, 102 and 101 for storing Yp1,k, Yp2,k and a-priori lambda Lk; gamma
processor 104 for receiving the content of registers 101-103, calculating γ0,k, γ1,k, γ2,k and γ3,k according to equations
(1)-(4) and providing them to registers 110-113; gamma memory 120 for storing gammas of a window; a-priori lambda
memory 130 for storing a-priori lambdas of a window; eight alpha registers 150-157 for storing eight alphas; eight
beta registers 160-167 for storing eight betas; eight lambda registers 170-177 for storing eight intermediate results
that are used to calculate lambda; eight selection units 180-187, coupled to registers 150-157, alpha memory 190,
registers 110-113 and gamma memory 120, for providing alphas, betas and gammas to eight arithmetic control units
ALU0-ALU7 140-147; and ALU0-ALU7 140-147 for implementing equations (5)-(20) and providing the results of their
calculations to alpha registers 150-157, beta registers 160-167 and lambda registers 170-177. During steps 46 and
66 ALU0-ALU7 140-147 provide lambda registers 170-177 with eight intermediate results. These results are shown
in brackets in equations (22) and (23). Alphas are provided to registers 150-157, betas are provided to registers
160-167 and the eight intermediate results are provided to registers 170-177. MAX_0 unit 210 implements equation (22) and
provides max(0) to adder 220. MAX_1 unit 211 implements equation (23) and provides max(1) to adder 220. Adder
220 shifts both max(0) and max(1) to the right, subtracts max(1) from max(0), subtracts Lk from the result and provides
Λk. Selection units 180-187 select which variables are provided to ALU0-ALU7 140-147. For example, during a
calculation of α(0,k) in step 36 selection unit 180 provides ALU0 140 with α(0,k-1) from register 150, α(1,k-1) from
register 151 and γ0,k from register 110 so that ALU0 140 can implement equation (5). During a calculation of lambda
in step 46 selection unit 180 provides ALU0 140 with α(4,k-1) from alpha memory 190, β(2,k) from register 162 and γ2,k
from gamma memory 120.
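
The operation attributed to adder 220 can be sketched with integer arithmetic as follows. This is only an illustration under stated assumptions: the function name is hypothetical, the word width and rounding behaviour of adder 220 are not given in the text, and Python's `>>` on integers is used as a stand-in for an arithmetic right shift by one bit.

```python
def adder_220(max0, max1, L_k):
    """Halve max(0) and max(1) by a one-bit right shift, then subtract.

    Integer sketch of equation (21): (Max(0) - Max(1))/2 - Lk.
    """
    return (max0 >> 1) - (max1 >> 1) - L_k

print(adder_220(10, 4, 1))   # prints 2, the same as (10 - 4)/2 - 1
print(adder_220(7, 4, 1))    # prints 0, whereas (7 - 4)/2 - 1 = 0.5
```

Shifting before subtracting keeps the intermediate values within the original word width, at the price of discarding the least significant bit of each operand.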

[0062] Registers 101-103 are coupled to data bus 712 and are coupled to gamma processor 104 via buses 301-303.
Registers 110-113 are coupled to gamma processor 104 via bus 304 and to selection units 180-187 via bus
311. Gamma memory 120 is coupled to gamma processor 104 via bus 304 and to selection units 180-187 via bus 311.
A-priori lambda memory 130 is coupled to register 101 via bus 301 and to adder 220 via bus 330. Selection units 180-187
are coupled to alpha registers 150-157 and beta registers 160-167 via buses 350 and 360 and to gamma memory
120 via bus 120. Preferably, portions of buses 350, 360, 322 and 311 are coupled to each selection unit. For example,
selection unit 180 is coupled to portions of bus 350 to receive the content of alpha registers 150 and 151. Buses 350 and
360 are coupled to bus 712 and to alpha memory 190. ALU0-ALU7 140-147 are coupled to selection units 180-187
via buses 380-387, and to alpha registers 150-157, beta registers 160-167 and intermediate lambda registers 170-177
via buses 340-347. MAX_0 and MAX_1 units 210 and 211 are coupled via buses 370 and 371 to intermediate
lambda registers 170-177 and via buses 310 and 311 to adder 220. Adder 220 is coupled to bus 712.

[0063] During step 34 initial conditions of nodes are provided to alpha registers 150-157 and beta registers 160-167
from external memory module 71 via buses 350 and 360.

w [0064] During step 36 system 100 calculates gammas and alphas of the whole trellis, for example, it is assumed 
that a(0,k) - a(7,k) and -jO.k - y3,k are calculated. 

[0065] The calculation of -yO.k - y3,k is done by providing Yp1,k, Yp2,k and Lk from registers 101-103 to gamma 
processor 104, implementing equations (1) - (4) and storing the result in registers 110-113. 

[0066] The calculation of alphas (α(0,k) - α(7,k)) is done by providing ALU0 - ALU7 140-147 with gammas from
registers 110-113 and previous alphas (α(0,k-1) - α(7,k-1)) from alpha registers 150-157, implementing equations (5) - (12)
and storing alphas α(0,k) - α(7,k) in alpha registers 150-157. Alphas of nodes of starting stages of windows are
provided via buses 350 and 712 to external memory module 71.
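By way of a non-limiting illustration, the forward pass of step 36 may be sketched in software as follows. The sketch assumes a max-log-MAP style update, in which each new alpha is the maximum over incoming branches of the previous alpha plus the branch gamma, because equations (5) - (12) are not reproduced in this passage. The two-state trellis table, the function name forward_pass and the parameter window_len are hypothetical and serve only to keep the example short; the described embodiment uses eight states.

```python
NEG_INF = float("-inf")

# Hypothetical 2-state trellis: (previous_state, next_state, input_bit).
# The embodiment described above has eight states; two are used for brevity.
TRANSITIONS = [(0, 0, 0), (0, 1, 1), (1, 1, 0), (1, 0, 1)]


def forward_pass(gammas, num_states, window_len, initial_alpha):
    """Run the forward recursion over the whole trellis.

    gammas[k][(prev, nxt)] is the branch metric of stage k.
    Returns the final alphas and the alphas saved at each window start,
    which play the role of the values sent to the external memory module.
    """
    alpha = list(initial_alpha)
    checkpoints = {}
    for k, gamma in enumerate(gammas):
        if k % window_len == 0:
            # Starting stage of a window: keep a copy for later retrieval.
            checkpoints[k] = list(alpha)
        next_alpha = [NEG_INF] * num_states
        for prev, nxt, _bit in TRANSITIONS:
            candidate = alpha[prev] + gamma[(prev, nxt)]
            if candidate > next_alpha[nxt]:
                next_alpha[nxt] = candidate
        alpha = next_alpha
    return alpha, checkpoints
```

Only the alphas of the window starting stages are retained by this pass; the alphas of all other stages are discarded and recomputed later, window by window.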

[0067] During step 42 alphas of nodes of a starting stage of a window are provided via buses 350 and 712 from
external memory module 71 to alpha registers 150-157. Preferably, host processor 79 provides the control and address
signals and selects which window to process.

[0068] During step 44 alphas and gammas of a window are calculated. Equations (1) - (4) and (5) - (12) are implemented
by activating registers 101-103, gamma processor 104, gamma registers 110-113, selection units 180-187,
ALU0 - ALU7 140-147 and alpha registers 150-157, as in step 36, but the alphas and gammas of nodes of the window
are stored in alpha memory 190 and in gamma memory 120, so that when step 44 ends gamma memory 120 stores
all the gammas of the window and alpha memory 190 stores all the alphas of the window.
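The storage implied by steps 36 and 44 may be illustrated by a simple count. The following sketch assumes that one alpha value is kept per state and per stored stage; the function name alpha_storage, the block length of 5120 and the window length of 64 used in the usage note are hypothetical example values and are not taken from the description.

```python
import math


def alpha_storage(block_len, window_len, num_states=8):
    """Count stored alpha values under the windowed scheme.

    external_checkpoints: alphas of window starting stages (step 36).
    internal_window: alphas of one full window (step 44).
    unwindowed: alphas of every stage, for comparison only.
    """
    num_windows = math.ceil(block_len / window_len)
    return {
        "external_checkpoints": num_windows * num_states,
        "internal_window": window_len * num_states,
        "unwindowed": block_len * num_states,
    }
```

For example, alpha_storage(5120, 64) counts 640 checkpoint values and 512 window values, against 40960 values if the alphas of every stage were kept.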
[0069] During step 46 the betas and lambdas of a window are calculated.

[0070] Gammas from gamma memory 120 and betas from beta registers 160-167 are used to implement equations
(13) - (20) and (21) - (23) so that the betas and lambdas of the window are calculated. Equations (13) - (20) are
implemented by providing previous betas from beta registers 160-167 and gammas from gamma memory 120 to selection
units 180-187 and calculating betas. Lambdas are calculated by providing betas from beta registers 160-167, alphas
from alpha memory 190 and gammas from gamma memory 120 to selection units 180-187 and to ALU0 - ALU7
140-147. ALU0 - ALU7 140-147 provide eight intermediate results to lambda registers 170-177; four intermediate
results are provided to MAX_0 unit 210 and four are provided to MAX_1 unit 211 for implementing equations (22) and
(23) and providing max(0) and max(1) to adder 220. Adder 220 shifts max(0) and max(1) to the right, subtracts
max(1)/2 from max(0)/2, subtracts from the result an a-priori lambda from memory 130 and provides lambdas to external
memory via bus 712.
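The integration of the backward recursion with the lambda calculation may be sketched as follows. The sketch assumes a max-log-MAP style update and follows the adder as recited in the claims, namely half of the maximum associated with "0" minus half of the maximum associated with "1" minus the a-priori lambda. Equations (13) - (23) are not reproduced in this passage, so the exact metrics are assumptions; the two-state trellis table and the name backward_with_lambda are hypothetical.

```python
NEG_INF = float("-inf")

# Hypothetical 2-state trellis: (previous_state, next_state, input_bit).
TRANSITIONS = [(0, 0, 0), (0, 1, 1), (1, 1, 0), (1, 0, 1)]


def backward_with_lambda(alphas, gammas, beta_end, apriori, num_states=2):
    """Process one window backwards, producing betas and lambdas together.

    alphas[k]: stored forward metrics entering stage k of the window.
    gammas[k][(prev, nxt)]: stored branch metrics of stage k.
    beta_end: backward metrics at the ending stage of the window.
    apriori[k]: a-priori lambda of stage k.
    """
    beta = list(beta_end)
    lambdas = [0.0] * len(gammas)
    for k in range(len(gammas) - 1, -1, -1):
        prev_beta = [NEG_INF] * num_states
        best = [NEG_INF, NEG_INF]  # best[bit]: role of the MAX_0 / MAX_1 units
        for prev, nxt, bit in TRANSITIONS:
            # gamma + beta is shared by the beta update and the lambda term.
            partial = gammas[k][(prev, nxt)] + beta[nxt]
            prev_beta[prev] = max(prev_beta[prev], partial)
            best[bit] = max(best[bit], alphas[k][prev] + partial)
        # Adder: halve both maxima, subtract, then remove the a-priori lambda.
        lambdas[k] = best[0] / 2 - best[1] / 2 - apriori[k]
        beta = prev_beta
    return lambdas, beta
```

The returned betas are those of the starting stage of the window, so they can serve as the ending-stage betas of the preceding window when windows are processed from the last window backwards.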

[0071] It should be noted that the particular terms and expressions employed and the particular structural and
operational details disclosed in the detailed description and accompanying drawings are for illustrative purposes only
and are not intended to in any way limit the scope of the invention as described in the appended claims.

[0072] Thus, there has been described herein at least one preferred embodiment of an improved method and
device for performing SISO decoding. It will be
apparent to those skilled in the art that the disclosed subject matter may be modified in numerous ways and may
assume many embodiments other than the preferred form specifically set out and described above.
[0073] Accordingly, the above disclosed subject matter is to be considered illustrative and not restrictive, and to the
maximum extent allowed by law, it is intended by the appended claims to cover all such modifications and other
embodiments which fall within the true spirit and scope of the present invention. The scope of the invention is to be
determined by the broadest permissible interpretation of the following claims and their equivalents rather than the
foregoing detailed description.


Claims 

1. A method for performing SISO decoding, the method comprising the steps of:

(a) providing a trellis representative of an output of a convolutional encoder, the convolutional encoder having
a coding rate of R, the trellis having a block length T and being divided into windows;
(b) assigning an initial condition to each node of a starting stage and an ending stage of the trellis;
(c) computing a forward metric for each node, starting from the starting stage of the trellis and advancing
forward through the trellis, and storing forward metrics of nodes of a plurality of starting stages of windows;
(d) repeating stages d(1) - d(3) until all lambdas of the trellis are calculated:

(1) retrieving forward metrics of nodes of a starting stage of a window, the retrieved forward metrics having been
computed and stored during step (c);

(2) computing and storing forward metrics for each node, starting from a second stage of the window and
ending at the ending stage of the window;

(3) computing backward metrics for each node, starting from the ending stage of the window and ending
at the starting stage of the window; wherein when backward metrics of nodes of a stage are computed
and the forward metrics of the nodes of an adjacent stage were previously computed, the computation of
backward metrics is integrated with the computation of lambda from the stage to the adjacent stage and
a storage of the computed lambdas.

2. The method of claim 1 wherein the computation of the lambdas starts at a last window of the trellis and advances
backwards through the trellis.

3. The method of claim 1 wherein gammas are computed during stage d(2). 

4. The method of claim 1 wherein all windows except a last window of the trellis have a length of WN, WN ≪ T.

5. The method of claim 1 wherein the windows do not overlap. 

6. The method of claim 1 wherein a computation of backward metrics of nodes of an ending stage of a window that
is not a last window in the trellis is preceded by a computation of backward metrics of a following window, wherein
the starting stage of the following window follows the ending stage of the window.

7. The method of claim 1 wherein step (c) involves storing the forward metrics of nodes of the starting stages of 
windows in an external memory module. 

8. The method of claim 1 wherein step d(3) involves storing the lambdas in an external memory module.

9. The method of claim 1 wherein step d(2) involves storing the forward metrics in an internal memory module. 

10. The method of claim 1 wherein the method is used to implement one of the Log MAP algorithms. 


11. A method for performing SISO decoding, the method comprising the steps of:

(a) providing a trellis representative of an output of a convolutional encoder, the convolutional encoder having
a coding rate of R, the trellis having a block length T;
(b) assigning an initial condition to each node of an ending stage and a starting stage of the trellis;
(c) computing a backward metric for each node, starting from nodes of the ending stage of the trellis and
advancing backward through the trellis, and storing backward metrics of nodes of a plurality of ending stages
of windows;
(d) repeating stages d(1) - d(3) until all lambdas of the trellis are calculated:

(1) retrieving backward metrics of nodes of an ending stage of a window, the retrieved backward metrics
having been computed and stored during step (c);

(2) computing and storing backward metrics for each node, starting from a stage that precedes the ending
stage of the window and ending at the first stage of the window;

(3) computing forward metrics for each node, starting from the starting stage of the window and ending
at the ending stage of the window; wherein when forward metrics of nodes of a stage are computed and
the backward metrics of the nodes of an adjacent stage were previously computed, the computation of
forward metrics is integrated with the computation of lambda from the stage to the adjacent stage and a
storage of the computed lambdas.


12. The method of claim 11 wherein the computation of the lambdas starts at a first window of the trellis and advances
forwards through the trellis.

13. The method of claim 11 wherein gammas are computed during stage d(2).

14. The method of claim 11 wherein all windows except a last window of the trellis have a length of WN, WN ≪ T.

15. The method of claim 11 wherein the windows do not overlap.

16. The method of claim 11 wherein a computation of forward metrics of nodes of a starting stage of a window that is
not a first window in the trellis is preceded by a computation of forward metrics of a preceding window, wherein
the ending stage of the preceding window is followed by the starting stage of the window.

17. The method of claim 11 wherein step (c) involves storing the backward metrics of nodes of the ending stages of 
windows in an external memory module. 

18. The method of claim 11 wherein step d(3) involves storing the lambdas in an external memory module. 


19. The method of claim 11 wherein step d(2) involves storing the backward metrics in an internal memory module. 

20. The method of claim 11 wherein the method is used to implement one of the Log MAP algorithms. 

21. A system for decoding a sequence of signals output by a convolutional encoder and transmitted over a channel,
the encoder output represented by a trellis having a block length T, the system comprising:

an internal memory, for storing a plurality of variables that are required for calculating lambdas of a window;

an external memory module, adapted to store a plurality of variables that are required for calculating lambdas
of the trellis; and

a processor, coupled to the external memory and the internal memory, for calculating forward metrics, backward
metrics, branch metrics and lambdas and for accessing the external and internal memory modules;

wherein the system is adapted to calculate the forward metrics of the whole trellis, store forward metrics of nodes
of starting stages of windows, calculate forward metrics and branch metrics of a window, store the forward
metrics and branch metrics in the internal memory module, and use the forward metrics and branch metrics within
the internal memory module to calculate lambdas of the window, whereas the system calculates the lambdas
of various windows until all lambdas of the trellis are calculated.

22. The system of claim 21 wherein the system starts to compute the lambdas of a last window of the trellis and
advances backwards through the trellis.

23. The system of claim 22 wherein all windows except a last window of the trellis have a length of WN, WN ≪ T.

24. The system of claim 22 wherein the windows do not overlap. 


25. The system of claim 22 wherein the system computes backward metrics of nodes of an ending stage of a window
that is not a last window in the trellis after the system calculates backward metrics of a following window, wherein
the starting stage of the following window follows the ending stage of the window.

26. The system of claim 22 wherein the system stores the forward metrics of nodes of starting stages of windows in
the external memory module.

27. The system of claim 22 wherein the system stores the lambdas it calculates in the external memory module. 

28. The system of claim 22 wherein the system stores the forward metrics and the branch metrics of a window in the
internal memory module.

29. The system of claim 22 wherein the system is used to implement one of the Log MAP algorithms. 

30. The system of claim 22 wherein the processor and the internal memory further comprise:

a gamma calculator, coupled to the external memory module and to the internal memory module, for receiving
received signals and an a-priori lambda and for calculating branch metrics;

a gamma register file, coupled to the gamma calculator, for storing branch metrics calculated by the gamma
calculator;

a processor register file, coupled to an abc processor and to the external memory module, for storing forward
metrics, backward metrics and intermediate results to be provided to the abc processor or the external memory
module; and

an abc processor, coupled to the gamma register file, the internal memory module and the external memory
module, for receiving gammas and either one of forward metrics or backward metrics stored within the processor
register file and calculating either forward metrics, backward metrics or lambdas.

31. The system of claim 30 wherein the abc processor further comprises:

a plurality of selection units, coupled to the processor register file, to the gamma register file and to a plurality
of arithmetic logic units, for selecting the input signals to be inputted to the arithmetic logic units;

a plurality of arithmetic logic units, for receiving inputs from the plurality of selection units, for calculating
forward metrics, backward metrics and intermediate results and for storing them in the processor register file;

a plurality of MAX units, for finding two maximal intermediate results, whereas the first maximal intermediate
result is associated with transitions in the trellis caused by a transmission of "1" and a second maximal
intermediate result is associated with transitions in the trellis caused by a transmission of "0"; and

an adder, for generating lambda by subtracting an a-priori lambda and a half of the first maximal result from
a half of the second maximal result.

32. The system of claim 31 wherein the processor register file comprises a plurality of alpha registers for storing
forward metrics, a plurality of beta registers for storing backward metrics and a plurality of lambda registers for
storing intermediate results.


33. A system for decoding a sequence of signals output by a convolutional encoder and transmitted over a channel,
the encoder output represented by a trellis having a block length T, the system comprising:

an internal memory, for storing a plurality of variables that are required for calculating lambdas of a window;

an external memory module, adapted to store a plurality of variables that are required for calculating lambdas
of the trellis; and

a processor, coupled to the external memory and the internal memory, for calculating forward metrics, backward
metrics, branch metrics and lambdas and for accessing the external and internal memory modules;

wherein the system is adapted to calculate the backward metrics of the whole trellis, store backward metrics of
nodes of ending stages of windows, calculate backward metrics and branch metrics of a window, store the
backward metrics and branch metrics in the internal memory module, and use the backward metrics and branch
metrics within the internal memory module to calculate lambdas of the window, whereas the system calculates
the lambdas of various windows until all lambdas of the trellis are calculated.

34. The system of claim 33 wherein the system starts to compute the lambdas of a first window of the trellis and
advances forwards through the trellis.

35. The system of claim 33 wherein all windows except a first window of the trellis have a length of WN, WN ≪ T.

36. The system of claim 33 wherein the windows do not overlap.

37. The system of claim 33 wherein the system computes forward metrics of nodes of a starting stage of a window
that is not a first window in the trellis after the system calculates forward metrics of a previous window, wherein
the ending stage of the previous window precedes the starting stage of the window.


38. The system of claim 33 wherein the system stores the backward metrics of nodes of ending stages of windows
in the external memory module.

39. The system of claim 33 wherein the system stores the lambdas it calculates in the external memory module.


40. The system of claim 33 wherein the system stores the backward metrics and the branch metrics of a window in
the internal memory module.

41. The system of claim 33 wherein the system is used to implement one of the Log MAP algorithms.

42. The system of claim 33 wherein the processor and the internal memory further comprise:

a gamma calculator, coupled to the external memory module and to the internal memory module, for receiving
received signals and an a-priori lambda and for calculating branch metrics;

a gamma register file, coupled to the gamma calculator, for storing branch metrics calculated by the gamma
calculator;

a processor register file, coupled to an abc processor and to the external memory module, for storing forward
metrics, backward metrics and intermediate results to be provided to the abc processor or the external memory
module; and

an abc processor, coupled to the gamma register file, the internal memory module and the external memory
module, for receiving gammas and either one of forward metrics or backward metrics stored within the processor
register file and calculating either forward metrics, backward metrics or lambdas.

43. The system of claim 42 wherein the abc processor further comprises:

a plurality of selection units, coupled to the processor register file, to the gamma register file and to a plurality
of arithmetic logic units, for selecting the input signals to be inputted to the arithmetic logic units;

a plurality of arithmetic logic units, for receiving inputs from the plurality of selection units, for calculating
forward metrics, backward metrics and intermediate results and for storing them in the processor register file;

a plurality of MAX units, for finding two maximal intermediate results, whereas the first maximal intermediate
result is associated with transitions in the trellis caused by a transmission of "1" and a second maximal
intermediate result is associated with transitions in the trellis caused by a transmission of "0"; and

an adder, for generating lambda by subtracting an a-priori lambda and a half of the first maximal result from
a half of the second maximal result.

44. The system of claim 43 wherein the processor register file comprises a plurality of alpha registers for storing
forward metrics, a plurality of beta registers for storing backward metrics and a plurality of lambda registers for
storing intermediate results.